GPUTeraSort: High Performance Graphics Coprocessor Sorting for Large Database Management
Abstract
Sorting large datasets is often limited by memory and disk I/O bandwidth. On a conventional von Neumann architecture, sorting incurs frequent misses across the L1-L3 cache levels, and these misses limit most sorting performance. Although larger main memories have reduced disk I/O, memory access itself remains a bottleneck. The authors observed that the highly parallel processors and fast memory interconnects inside commodity GPUs can work around the limitations of sorting solely on the CPU. In particular, a GPU may offer about 10 times the memory bandwidth of a CPU.
Similar resources
Sorting with GPUs: A Survey
Sorting is a fundamental operation in computer science and is a bottleneck in many important fields. Sorting is critical to database applications, online search and indexing, biomedical computing, and many other applications. The explosive growth in computational power and availability of GPU coprocessors has allowed sort operations on GPUs to be done much faster than any equivalently priced CP...
Fast In-Place Sorting with CUDA Based on Bitonic Sort
State of the art graphics processors provide high processing power and furthermore, the high programmability of GPUs offered by frameworks like CUDA increases their usability as high-performance coprocessors for general-purpose computing. Sorting is well-investigated in Computer Science in general, but (because of this new field of application for GPUs) there is a demand for high-performance pa...
Load-aware inter-co-processor parallelism in database query processing
For a decade, the database community has been exploring graphics processing units and other co-processors to accelerate query processing. While the developed algorithms often outperform their CPU counterparts, it is not beneficial to keep processing devices idle while over utilizing others. Therefore, an approach is needed that efficiently distributes a workload on available (co-)processors whi...
GPU-based Sorting in PostgreSQL
Although the number of transistors per microprocessor core has increased over the last decade, this increase in complexity has caused only a modest increase in application performance. Most performance improvements can be traced to faster clock rates from technology scaling. While complex, wide-issue, superscalar cores provide higher single-thread performance, they are often poorly suited to ap...
Parallel Sorting on GPU Clusters
It is becoming more common to install modern graphics cards on small to medium size commodity clusters. In addition to applications such as display walls and CAVE environments, graphics cards can be used as dedicated coprocessors that can run certain parallel algorithms very quickly. Sorting has been long recognized as an important algorithm in terms of both mathematical analysis and a way to j...